Abstract

There are many articles discussing the solution of boundary value problems
on various parallel machines. The solution of initial value problems
does not lend itself to parallelism, since in this case one uses methods that
are sequential in nature.

Here  we  develop  a  parallel  scheme  for  initial  value  problems  based  on 
the  box  scheme  and  a  modified  recursive  doubling  technique. 

Fully implicit Runge-Kutta methods were discussed by Jackson and
Norsett (1986) and Lie (1987). Lie assumes that each processor of the
parallel computer has vector capabilities.


1  Introduction


We consider the solution of linear initial value problems on a hypercube. “By a
hypercube we intend a distributed memory MIMD computer with communication
between processors ... via a communication network having the topology of
a p-dimensional cube, with the vertices considered as processors and the edges
as communication links” (Keller and Nelson, 1987). See also Fox (1984, 1985,
1987) and Fox and Otto (1984). Our method of solution uses the box
scheme to discretize the system of initial value problems

$$y' = Ay + f(x)$$

$$y(a) = y_0$$

where $y$ and $f$ are $n$-dimensional vectors and $A$ is an $n \times n$ matrix. The resulting
system of equations is solved by a modified version of the recursive doubling
technique (see Stone, 1973).

In the next section, the discretization is described and the resulting system
of equations is given. Section 3 describes the modified recursive doubling
technique and its application to our system.

It  will  be  interesting  to  experiment  with  the  method  and  compare  the  results 
to  a  sequential  initial  value  solver  of  the  same  order. 


2  The  Single  Step  Method 


Consider the system of initial value problems

$$y' = A(x)\,y + f(x), \quad a < x < b, \qquad y(a) = y_0 \tag{1}$$

where

$$y = (y_1, \ldots, y_n)^T, \quad f = (f_1(x), \ldots, f_n(x))^T, \quad y_0 = (y_{10}, \ldots, y_{n0})^T$$

and

$$A = (a_{ij}(x)), \quad 1 \le i, j \le n.$$

Let

$$x_j = a + jh, \quad j = 0, 1, \ldots, m \tag{2}$$

where

$$h = \frac{b - a}{m} \tag{3}$$

be a uniform mesh. The box scheme (see e.g. Keller, 1976), applied to (1), yields

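$$y_{j+1} = y_j + h\left[A_{j+1/2}\,\frac{y_{j+1} + y_j}{2} + f_{j+1/2}\right], \quad j = 0, 1, \ldots, m-1, \qquad y_0 = y(a) \tag{4}$$

where

$$A_{j+1/2} = A\!\left(x_j + \tfrac{h}{2}\right), \qquad f_{j+1/2} = f\!\left(x_j + \tfrac{h}{2}\right)$$

and $y_j$ is the approximation to $y(x_j)$.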
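A minimal sketch of the box scheme, restricted to the scalar case $n = 1$ so that the matrix inverse becomes a division. The function name `box_scheme` and the test problem $y' = -y$, $y(0) = 1$ are illustrative assumptions, not taken from the paper.

```python
import math

def box_scheme(a_fn, f_fn, a, b, y0, m):
    """Integrate y' = a_fn(x)*y + f_fn(x), y(a) = y0, with m box-scheme steps.

    Scalar case only (an assumption made for brevity): the implicit step is
    solved for y_{j+1} by dividing by (1 - h/2 * A_{j+1/2}).
    """
    h = (b - a) / m
    y = y0
    ys = [y]
    for j in range(m):
        x_mid = a + (j + 0.5) * h          # midpoint x_j + h/2
        A = a_fn(x_mid)
        y = ((1 + 0.5 * h * A) * y + h * f_fn(x_mid)) / (1 - 0.5 * h * A)
        ys.append(y)
    return ys

# Hypothetical test problem: y' = -y, y(0) = 1, exact solution exp(-x).
ys = box_scheme(lambda x: -1.0, lambda x: 0.0, 0.0, 1.0, 1.0, 100)
err = abs(ys[-1] - math.exp(-1.0))
```

For a system, the division would be replaced by a linear solve with the matrix $I - \tfrac{h}{2}A_{j+1/2}$.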

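Let $\{j_i,\ i = 1, 2, \ldots, s\}$ be a strictly increasing sequence such that $j_1 > 0$
and $j_s = m$. We shall compute the solution at the points $x_{j_i}$. Let $\Phi_i$ be
$n \times n$ matrices defined for each $i$ by

$$\Phi_i = \prod_{j=j_{i-1}}^{j_i - 1} \left(I - \tfrac{h}{2}A_{j+1/2}\right)^{-1}\left(I + \tfrac{h}{2}A_{j+1/2}\right), \quad i = 1, 2, \ldots, s \tag{5}$$

where $j_0 = 0$, factors with larger $j$ stand to the left, and $h$ is sufficiently small so that the matrices $I - \tfrac{h}{2}A_{j+1/2}$ are nonsingular.
Similarly let the $n$-vector $\varphi_i$ be

$$\varphi_i = \left(I - \tfrac{h}{2}A_{j_i - 1/2}\right)^{-1}\left[\left(I + \tfrac{h}{2}A_{j_i - 1/2}\right)\psi_{j_i - j_{i-1} - 1} + h f_{j_i - 1/2}\right], \quad i = 1, 2, \ldots, s \tag{6}$$

where

$$\psi_0 = 0 \tag{7}$$

and

$$\psi_{j+1} = \left(I - \tfrac{h}{2}A_{j_{i-1}+j+1/2}\right)^{-1}\left[\left(I + \tfrac{h}{2}A_{j_{i-1}+j+1/2}\right)\psi_j + h f_{j_{i-1}+j+1/2}\right], \quad j = 0, 1, \ldots, j_i - j_{i-1} - 2. \tag{8}$$

Then it can be easily shown, as in Keller and Nelson (1987), that

$$y_{j_i} = \Phi_i\, y_{j_{i-1}} + \varphi_i, \quad i = 1, 2, \ldots, s. \tag{9}$$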




Remarks 


1. The matrices to be inverted are of order $n$, the number of equations in the
original system (1).

2. The last factor in the product defining $\Phi_i$ is the matrix required in computing $\varphi_i$.

3. The vector $\varphi_i$ can be computed by (7) - (8) in the same loop in which one
computes $\Phi_i$, since it requires the same matrices.
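The third remark can be sketched as follows, again in the scalar case $n = 1$. The helper names `box_step` and `segment_operators`, the coefficient functions, and the break points are hypothetical choices for illustration. The loop accumulates the propagator $\Phi_i$ and the inhomogeneous part $\varphi_i$ together, and the script then checks that chaining the segments reproduces plain sequential stepping.

```python
import math

def box_step(A, f, h, y):
    """One box-scheme step in the scalar case (n = 1)."""
    return ((1 + 0.5 * h * A) * y + h * f) / (1 - 0.5 * h * A)

def segment_operators(a_fn, f_fn, a, h, j_prev, j_cur):
    """Return (Phi_i, phi_i) for the mesh steps j_prev, ..., j_cur - 1."""
    Phi, psi = 1.0, 0.0                      # psi_0 = 0
    for j in range(j_prev, j_cur):
        x_mid = a + (j + 0.5) * h
        A, f = a_fn(x_mid), f_fn(x_mid)
        T = (1 + 0.5 * h * A) / (1 - 0.5 * h * A)
        Phi = T * Phi                        # newest factor goes on the left
        psi = box_step(A, f, h, psi)         # same coefficients, same loop
    return Phi, psi                          # final psi is phi_i

# Hypothetical data: y' = -2x*y + cos(x), y(0) = 1, m = 16, s = 4 segments.
a, b, m = 0.0, 1.0, 16
h = (b - a) / m
a_fn = lambda x: -2.0 * x
f_fn = math.cos
breaks = [0, 4, 8, 12, 16]                   # j_0, j_1, ..., j_s

y_seq = 1.0
for j in range(m):                           # plain sequential stepping
    x_mid = a + (j + 0.5) * h
    y_seq = box_step(a_fn(x_mid), f_fn(x_mid), h, y_seq)

y_seg = 1.0
for j_prev, j_cur in zip(breaks, breaks[1:]):
    Phi, phi = segment_operators(a_fn, f_fn, a, h, j_prev, j_cur)
    y_seg = Phi * y_seg + phi                # y_{j_i} = Phi_i y_{j_{i-1}} + phi_i
```

Each call to `segment_operators` depends only on its own segment, so the $s$ calls could run on separate processors before the results are combined.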


3  Parallel  Evaluation 


To solve (9) on a hypercube with $p = s$ processors, one can modify the recursive
doubling technique developed by Stone (1973).

Let

$$b_1 = \Phi_1 y_0 + \varphi_1, \qquad b_j = \varphi_j, \quad j = 2, 3, \ldots, s \tag{10}$$

and let $Y_i(j)$ be a function of $b_j, b_{j-1}, \ldots, b_{j-i+1}, \Phi_j, \ldots, \Phi_{j-i+1}$. Then the following
results can be proved using arguments similar to those in Stone (1973).
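Theorem. Let $Y_i(j)$ satisfy the recurrence relation

$$Y_{i+1}(j) = Y_1(j) + \Phi_j\, Y_i(j-1), \quad i, j \ge 1 \tag{11}$$

with boundary conditions

$$Y_1(j) = b_j, \quad j \ge 1$$

$$Y_i(j) = 0, \quad j \le 0 \ \text{or} \ i \le 0.$$

Then

(i)

$$Y_{i+s}(j) = Y_s(j) + \left\{\prod_{k=j-s+1}^{j} \Phi_{2j-k-s+1}\right\} Y_i(j-s), \tag{12}$$

where the factors are written from left to right in order of increasing $k$,

(ii)

$$Y_i(j) = \sum_{k=j-i+1}^{j} \left\{\prod_{s=k+1}^{j} \Phi_{j-s+k+1}\right\} b_k, \quad j \ge i \ge 1, \tag{13}$$

$$\phantom{Y_i(j)} = \sum_{k=1}^{j} \left\{\prod_{s=k+1}^{j} \Phi_{j-s+k+1}\right\} b_k, \quad i \ge j \ge 1, \tag{14}$$

(iii) for $i \ge j \ge 1$,

$$Y_i(j) = y_{j_j}. \tag{15}$$

Corollary

$$Y_{2i}(j) = Y_i(j) + \left\{\prod_{k=j-i+1}^{j} \Phi_{2j-k-i+1}\right\} Y_i(j-i), \quad i, j \ge 1 \tag{16}$$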
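As an illustration of how (16) builds up the solution, the following worked case takes $s = 4$ and follows the last point $j = 4$ through the two doubling steps. It is a sketch added for clarity, derived only from (10), (11) and (16).

```latex
\begin{aligned}
Y_2(j) &= Y_1(j) + \Phi_j\, Y_1(j-1) = b_j + \Phi_j b_{j-1}, \\
Y_4(4) &= Y_2(4) + \Phi_4 \Phi_3\, Y_2(2) \\
       &= b_4 + \Phi_4 b_3 + \Phi_4 \Phi_3 b_2 + \Phi_4 \Phi_3 \Phi_2 b_1 .
\end{aligned}
```

Substituting $b_1 = \Phi_1 y_0 + \varphi_1$ and $b_j = \varphi_j$ for $j \ge 2$ gives the same expression as applying (9) four times in sequence, but it is reached in two doubling steps instead of four sequential ones.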

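This corollary provides the recursive doubling algorithm for the solution of
(9). Let

$$M_i(j) = \begin{cases} \displaystyle\prod_{k=1}^{j} \Phi_{j-k+1}, & j < i \\[2ex] \displaystyle\prod_{k=j-i+1}^{j} \Phi_{2j-k+1-i}, & j \ge i \end{cases}$$

then (16) can be written as

$$Y_{2i}(j) = Y_i(j) + M_i(j)\, Y_i(j-i), \quad i, j \ge 1$$

$$M_{2i}(j) = M_i(j)\, M_i(j-i), \quad i, j \ge 1$$

with boundary conditions

$$M_1(j) = \Phi_j, \quad j \ge 1$$

$$M_i(j) = I, \quad i \le 0 \ \text{or} \ j \le 0.$$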

We are now ready to state the algorithm.

Algorithm

For $i = 1$ to $s/2$ in steps of $i$ (that is, $i = 1, 2, 4, \ldots, s/2$) do:

$$Y_{2i}(j) = Y_i(j) + M_i(j)\, Y_i(j-i), \quad i < j \le s$$

$$M_{2i}(j) = M_i(j)\, M_i(j-i), \quad i < j \le s$$

Next $i$.

From our theorem, $Y_s(j) = y_{j_j}$ for $1 \le j \le s$, so that $Y_s$ is the solution of (9).
We note that for each $i$, the indices pertaining to $j$ are executed simultaneously
on $s$ processors. Since $i$ doubles during each iteration, $\log_2 s$ iterations are
required for computation.
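The algorithm can be sketched in plain Python as below. This is a serial simulation under stated assumptions: the inner loop over `j` stands in for the $s$ processors, the helper names (`matmul`, `matvec`, `vecadd`, `recursive_doubling`) and the $2 \times 2$ test data are hypothetical, and the loop condition `i < s` is a small generalization that also covers values of $s$ that are not powers of two.

```python
def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def matvec(A, v):
    return [sum(A[r][k] * v[k] for k in range(len(v))) for r in range(len(A))]

def vecadd(u, v):
    return [x + y for x, y in zip(u, v)]

def recursive_doubling(Phi, phi, y0):
    """Solve y_i = Phi_i y_{i-1} + phi_i, i = 1..s; return [y_1, ..., y_s]."""
    s = len(Phi)
    # 1-based storage: Y[j] holds Y_i(j), M[j] holds M_i(j); start with i = 1.
    Y = [None] + [list(v) for v in phi]              # b_j = phi_j for j >= 2
    Y[1] = vecadd(matvec(Phi[0], y0), phi[0])        # b_1 = Phi_1 y_0 + phi_1
    M = [None] + [[row[:] for row in P] for P in Phi]
    i = 1
    while i < s:
        newY, newM = Y[:], M[:]
        for j in range(i + 1, s + 1):                # one j per processor
            newY[j] = vecadd(Y[j], matvec(M[j], Y[j - i]))
            newM[j] = matmul(M[j], M[j - i])
        Y, M = newY, newM
        i *= 2                                       # i doubles each pass
    return Y[1:]

# Hypothetical test data: s = 5 non-commuting 2x2 matrices.
s = 5
Phi = [[[1.0, 0.1 * i], [0.2, 1.0 - 0.05 * i]] for i in range(1, s + 1)]
phi = [[0.5 * i, -0.25 * i] for i in range(1, s + 1)]
y0 = [1.0, 2.0]

par = recursive_doubling(Phi, phi, y0)

seq, y = [], y0
for P, f in zip(Phi, phi):                           # sequential reference
    y = vecadd(matvec(P, y), f)
    seq.append(y)

max_diff = max(abs(p - q) for u, v in zip(par, seq) for p, q in zip(u, v))
```

The copies `newY` and `newM` ensure every update in a pass reads values from the previous pass, which mirrors the simultaneous execution across processors described above.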